\section{Data}
\label{sec:data}

Due to developments in ICT and networks, enormous amounts of data are generated. This data originates from transactions, emails, videos, audio, images, click streams, logs, posts, search queries, health records, social networking interactions, science data, sensors, and mobile phones and their applications \citep[p. 42]{SAGIROGLU}. Human activity created 5 exabytes of data up to 2003; with current technology, the same amount is created in two days. In 2012, the volume of data had expanded to 2.72 zettabytes, and it is predicted to double every two years, reaching 8 zettabytes by 2015 \citep[p. 3]{HADOOP}. Although data is a new resource, it does not constitute value in itself: it must be processed into useful patterns before value can be extracted. Big data and analytics can turn this vast amount of data into valuable and usable information and knowledge \citep[p. 2]{CISCO}.
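
\noindent As a rough consistency check on these figures, one can assume steady exponential growth with a fixed two-year doubling time. This growth model is our own simplifying assumption and is not taken from the cited sources. With $V_0 = 2.72$ zettabytes in the year $t_0 = 2012$, the volume $V(t)$ in year $t$ would be
\[
V(t) = V_0 \cdot 2^{(t - t_0)/2}, \qquad V(2015) = 2.72 \cdot 2^{3/2} \approx 7.7\ \mathrm{zettabytes},
\]
which is in line with the predicted figure of roughly 8 zettabytes.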

\subsection{Big Data, Open Data, and Open Government}
\label{subsec:bigdataopendataopengovernment}
Big data, open data, and open government are three different phenomena that can influence development in different sectors (social, economic, and institutional). Before discussing the relationships between these phenomena, it is important to understand the concepts, at least at the level of basic definitions.

\subsubsection{Big Data}

\begin{quote}
\textit{``Big Data refers to huge data sets that are orders of magnitude larger (volume), more varied and complex (variety), and generated at a faster rate (velocity) than your organization has had to deal with before.''} \citep[p. 3]{HADOOP}
\end{quote}

In general, big data is a term used to describe very large, complex, and rapidly changing datasets. This is, however, a subjective and technology-dependent description: as technology and data analytics advance, today's big data may no longer be perceived as big.

\subsubsection{Open Data}

The notion of open data has been around for some years, but it entered the mainstream when governments such as those of the UK, the USA, Canada, and New Zealand announced new initiatives for opening their public information \citep[p. 3]{HANDBOOK}. Before going deeper into the topic, it is important to understand what open data actually is. The \textit{Open Data Handbook Documentation} by \cite{HANDBOOK} provides a clear definition of open data:

\begin{quote}
\textit{``Open data is data that can be freely used,  re-used and redistributed by anyone - subject only, at most, to the requirement to attribute and sharealike.''} \citep[p. 6]{HANDBOOK} 

\end{quote}

\noindent According to \cite{HANDBOOK}, open data should have the following characteristics:

\begin{description}

\item[Availability and Access:] \textit{the data must be available as a whole and at no more than a reasonable reproduction cost, preferably by downloading over the internet. The data must also be available in a convenient and modifiable form.}

\item[Re-use and Redistribution:] \textit{the data must be provided under terms that permit re-use and redistribution including the intermixing with other datasets.}

\item[Universal participation:] \textit{everyone must be able to use, re-use and redistri\-bute - there should be no discrimination against fields of endeavour or against persons or groups. For example, `non-commercial' restrictions that would prevent `commercial' use, or restrictions of use for certain purposes (e.g. only in education), are not allowed.}

\end{description}

\noindent \cite{GURIN} provides a simple and effective definition of open data:

\begin{quote}

\textit{``Open Data can best be described as accessible public data that people, companies, and organizations can use to launch new ventures, analyze patterns and trends, make data-driven decisions, and solve complex problems.''}

\end{quote}

\noindent There is disagreement about whether open data covers only public data or the openness of data in general, regardless of whether it comes from the public or the private sector. It therefore makes sense to distinguish between the concepts of open data and \textit{open government data}.

\subsubsection{Open Government}
\label{subsec:opengovernment}

Open government is meant to be a way of increasing overall government transparency. Openness in government means that governments \textit{``provide high-value information, including raw data, in a timely manner, in formats that the public can easily locate, understand and use, and in formats that facilitate reuse''} \citep[p. 181]{YU}. The \textit{ambiguity} of open government makes the term Open Government Data (OGD) unclear, giving it two meanings: one refers to politically important disclosures, the other to \textit{``data that is both easily accessed and government related, but that might or might not be politically important''} \citep[p. 182]{YU}. Examples of OGD include large public government datasets (weather data, GPS, census, healthcare, etc.). \bigskip

\noindent \cite{BATES} argues that Open Government Data (OGD) has a progressive impact on society and is positioned as a socio-economic initiative. Various countries have developed different OGD initiatives, ranging from community-led to World Bank-sponsored, government-led, and civil-society-initiated \citep[p. 1]{DAVIES}. The UK government is focused on using OGD initiatives to leverage its \textit{marketisation} of new services \citep{BATES}. Similarly, more than 50 nations participate in a partnership program called the Open Government Partnership (OGP), whose objective is to promote good governance and strengthen democracy by increasing transparency, citizen participation, discussion with civil society, anti-corruption, accountability, and the use of technology and innovation \citep[p. 2]{DENMARK}. The Danish \textit{Open Government National Action Plan 2013--2014} focuses \textit{``on the use of new technology to strengthen transparency, growth, and the quality of life. It also focuses on a new approach to the role of the public sector where [the Danish Government] is to work on active and broad involvement of citizens, companies, and civil society in general''} \citep[p. 4]{DENMARK}. \bigskip

\noindent \cite{JANSSEN} describes the relationship between OGD development and \textit{right-to-information} (RTI) movements. She argues that RTI and OGD are closely related but have different focuses and priorities. As described by \cite{JANSSEN}, OGD is mainly focused on \textit{``innovation and economic growth on the one hand, and efficiency of the public sector on the other''}, whereas RTI is mainly focused on promoting \textit{``access to government information as a fundamental right''}. A similar subject is the re-use of Public Sector Information (PSI). To embrace the essence of OGD, the EU developed a policy on the re-use of PSI called the PSI directive \citep{JANSSEN2,PSI}. The PSI directive is economically inclined and does not aim to regulate matters related to access to government information. \cite{JANSSEN2} identifies the blurry relationship between the re-use of PSI and RTI as one of the reasons for the reluctance to open government data. \cite{JANSSEN2} therefore emphasizes the necessity of including both RTI and the re-use of PSI in the development of open data policies. \bigskip

\noindent We have already argued that a proper open government strategy promotes transparency and trust among citizens as well as the private sector. This leads to open governance strategies, which are concerned with the creation of proper institutions and effective rules and procedures for improving public service delivery \citep{OECD}. \cite{OECD} defines governance in terms of relationships and thus views the other public, private, and voluntary sectors as parts of general governance. In other words, open governance ensures the participation of other sectors (citizens, private, and voluntary) and provides access to government information in order to engage with its surroundings more effectively and respond actively to the associated actors. Open governance strategies promote collaborative approaches in which all the relevant sectors have the potential to be involved. Mutual understanding and sharing of information between these sectors lead to enhanced results.

\subsection{Relation between Big Data, Open Data, and Open Go\-vern\-ment}
\label{subsec:datatypes}

\begin{figure}[H]
  \centering
    \includegraphics[width=0.8\textwidth]{./Pictures/opendataandbigdata}
    \caption{\textit{Relation between Open Data, Big Data and Open Go\-vernment} \citep[p. 253]{GURIN}}
    \label{fig:relationopendataandbigdata}
\end{figure}

\noindent Figure~\ref{fig:relationopendataandbigdata} depicts the relationship between big data and open data, and how they relate to the broader concept of open government. \cite{GURIN} highlights some important aspects of big data and open data and argues that they \textit{``are very different in philosophy, goals, and practice''}:

\begin{itemize}
\item Big data is not democratic until and unless it is opened.
\item Open data does not follow the same rule as big data, i.e.\ the amount of data is irrelevant to whether it is open.
\item Big data may be generated unintentionally, usually from passive data sources, whereas open data is generated with some purpose.
\item Big data is often kept private for private or business reasons, whereas open data is made public.
\item When big data is turned into open data, it becomes powerful and many more benefits can be realized. National weather data and GPS data are examples of a successful overlap between open data and big data.

\end{itemize}

\noindent As to the relationship between open data and open government, open government takes advantage of open data by making OGD machine-readable and accessible, thereby promoting the transparency and accountability of the government \citep[p. 192]{YU}. As with big data, the public sector can embrace the concept of open data to realize the full potential and benefits of government data.

\subsection{Open data as an innovation enabler}
\label{subsec:opendatainnovationenabler}

Open data has been described as the world's greatest free resource, one that is expected to drive the future economy. The literature also makes clear that the need to open up data in order to build a better and more sustainable society has been recognized more or less worldwide \citep{GURIN, AGUSTI}. \cite{YU} identifies open data as a collaborative innovation approach and explains:

\begin{quote}

\textit{``when many individuals or groups are able to access information themselves and interact with it on their own terms (rather than in ways prescribed by others), significant benefits can accrue. Each of these movements are focused on certain classes of information, and each one leverages new technology to make that information more freely available and useful.''} \citep[p. 188]{YU}

\end{quote}

\noindent This indicates that open data leads to innovation that boosts development in different areas. Open data can be a powerful tool for unleashing new business and economic opportunities, addressing societal challenges, accelerating scientific progress, and creating a need to act at the local, regional, national, or continental level \citep{EU}. Furthermore, when an organisation or company opens up its data, economic benefits increase both outside and inside the organisation, because opening data makes it possible to cut some activities or handle them more efficiently and to decrease transaction costs \citep[p. 18]{FIORETTI}. On the basis of high-quality survey data collected from 138 Swedish IT entrepreneurs, \cite{LAKOMAA} present the importance of open data as an enabler of entrepreneurial activity and innovation \citep[p. 561]{LAKOMAA}. Open data can:

\begin{itemize}

\item simulate potential viability to ensure funding.
\item provide information about potential markets.
\item reduce development lead time to the application market.
\item drive innovation beyond applications.
\item enhance existing online services and offerings.

\end{itemize}

\noindent They conclude that opening up data provides a tangible and direct payoff by increasing entrepreneurial activity and enhancing business services, whereas not disseminating open data may lead to lost innovation and devalue new business plans \citep[p. 562]{LAKOMAA}.

\subsection{Open data for sustainable development in smart cities}

Open data has been recognized as an enabler of innovation. It has also been viewed as a resource for innovation, growth, and transparent governance, with the potential to lead Europe's economies onto a high and sustainable growth path \citep{EU}. Open data can support sustainable development by unlocking new business and economic opportunities, addressing social challenges, and accelerating scientific progress \citep[pp. 3-4]{EU}. The opening up of public sector information (PSI) alone can generate an estimated revenue of 40 billion euros \citep[p. 3]{EU}. The societal value of open data also increases its importance in a smart city context. Additionally, open data can provide government transparency (as mentioned earlier) \citep[p. 17]{FIORETTI}. \textit{``[This] brings people closer to their representatives''} \citep[p. 5]{HOWARD} and may also give them the opportunity to participate in governmental decision-making processes. \bigskip

\noindent By making government operations transparent and involving people in governance processes, cities can solve complex problems faster and more effectively by harnessing the collaborative power of the city actors. This can increase the quality of service and reduce the investment of public resources \citep[p. 150]{LEE}. Furthermore, \cite{HOWARD} explains how open data can support sustainability in the context of smart cities: open data can build trust, create accountability, build business, and create urban analytics, for example for crime \citep[pp. 6-7]{HOWARD}.

\subsection{Reluctance in opening data}
\label{subsec:reluctanceopendata}

Despite its remarkably high potential, the value of open data has not been fully realized. Although some EU countries are making their public sector information available on \textit{``transparent, effective and non-discriminatory terms''}, other countries are reluctant to make their data publicly available \citep[p. 446]{JANSSEN}. \cite{FIORETTI} identifies some of the reasons behind this reluctance \citep[pp. 22-23]{FIORETTI}:

\begin{itemize}

\item Lack of awareness of the importance and benefits of open data, lack of guidance and rules from upper levels about the reuse of data, and fear of losing control, all of which act as motivations for doing nothing.

\item Legal barriers or serious confusion about the legal status of data.

\item Fear of embarrassment deriving from publishing low quality data.

\item Lack of data maturity or low quality data.  
 
\end{itemize}


\noindent To promote the dissemination of public sector information, the European Commission adopted a PSI directive whose main aim was to encourage member states to open up the data they create \citep[pp. 446-447]{JANSSEN2}. According to \cite{JANSSEN2}, the PSI directive has been unable to solve the issues regarding access to and re-use of open data. \cite{JANSSEN2} also emphasizes that the development of integrative information policies can catalyse the opening of data \citep[p. 454]{JANSSEN2}. \bigskip

\noindent Researchers argue that the reluctance to open data is mainly seen in the public sector \citep{LEE}. This might also be the result of the \textit{``silo trap: the inability of government bodies to share information and collaborate effectively across organisational boundaries''} \citep[p. 90]{BASON}. \cite{ROBINSON} describe how federal rules prevent public bodies from keeping pace with private organisations. \bigskip

\noindent As described in section \ref{subsec:relevanceofopendata}, both the private and public sectors are trapped in silo thinking. There is a need for \textit{maturing} these actors with regard to developing a collective mindset of opening data. This requires us to understand the concept of maturity and how it works.

\subsection{Open Data Technologies}

We find it necessary to briefly explain how data can be opened from a technology point of view. We will not go into depth on all the different mechanisms and protocols but will only point to literature that explains some of the most widely used technologies. \bigskip

\noindent Opening and sharing of data inherently involves the Internet as a sharing platform for structuring, publishing, finding, and exploiting certain information \citep{AUER,HOFMANN}.

Data can be structured in the simplest way, for example as online PDF and text files. This type of static data is not very flexible in its structure and is difficult to make dynamic, but in terms of secure communication it is very effective (Appendix \ref{appsec:henrik}). The municipality of Copenhagen publishes a large amount of data about the city on its website, \url{http://data.kk.dk}, mostly as CSV, JSON, and PDF files. The update frequency varies, but the most frequently updated datasets are refreshed every 24 hours (e.g. \url{http://data.kk.dk/dataset/udlansdata}). Regularly updated data might be considered dynamic, but it should not be confused with \textit{real-time} data from sensors, data robots, or other data collection mechanisms\footnote{\url{http://www.investopedia.com/terms/r/real_time.asp}}. \bigskip
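
\noindent A minimal sketch of how a consumer might read such static files is given below in Python. The sample records, the column names (\texttt{branch}, \texttt{loans}), and the helper functions are hypothetical and are not taken from \url{http://data.kk.dk}; a real client would first download the published file. The sketch only illustrates that the same records, published once as CSV and once as JSON, can be loaded into a common structure with the standard library.

\begin{verbatim}
```python
import csv
import io
import json

# Hypothetical sample data standing in for files downloaded from an
# open data portal. The column names are assumptions for illustration.
SAMPLE_CSV = "branch,loans\nCentral,120\nNorth,45\n"
SAMPLE_JSON = (
    '[{"branch": "Central", "loans": 120},'
    ' {"branch": "North", "loans": 45}]'
)


def normalise(record):
    """Keep the two assumed fields and make sure loans is an integer."""
    return {"branch": record["branch"], "loans": int(record["loans"])}


def load_csv(text):
    """Parse CSV text (with a header row) into a list of dicts."""
    return [normalise(row) for row in csv.DictReader(io.StringIO(text))]


def load_json(text):
    """Parse JSON text (a list of objects) into a list of dicts."""
    return [normalise(obj) for obj in json.loads(text)]


def total_loans(records):
    """Sum the loan counts over all records."""
    return sum(record["loans"] for record in records)


if __name__ == "__main__":
    csv_records = load_csv(SAMPLE_CSV)
    json_records = load_json(SAMPLE_JSON)
    # Both formats carry the same information once parsed.
    print(csv_records == json_records)
    print(total_loans(csv_records))
```
\end{verbatim}

\noindent Because both loaders return the same structure, later processing steps do not need to know in which file format a dataset was published.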

\noindent One way to use open data as a platform for bringing together several groups of users and customers is through open Application Programming Interfaces (APIs). API platforms have the potential to reduce the transaction costs between their developers, customers, and end users \citep[p. 25]{MULLIGAN}. A Web Service is a type of API that typically operates over HTTP but can also operate over other communication protocols such as SMTP (in the case of SOAP Web services)\footnote{\url{http://www.w3.org/TR/ws-gloss/}}.
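
\noindent To make the idea of an open API over HTTP more concrete, the following Python sketch starts a small local server that publishes one dataset as JSON and then requests it as a client would. The path \texttt{/datasets/weather}, the dataset contents, and all function names are hypothetical; the sketch uses only the Python standard library and is not modelled on any particular open data platform.

\begin{verbatim}
```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical path and dataset exposed by the sketch API.
DATASET_PATH = "/datasets/weather"
DATASET = [{"station": "A", "temperature": 11.5}]


class OpenDataHandler(BaseHTTPRequestHandler):
    """Answers GET requests for the dataset with a JSON document."""

    def do_GET(self):
        if self.path == DATASET_PATH:
            body = json.dumps(DATASET).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404, "Unknown dataset")

    def log_message(self, format, *args):
        pass  # suppress request logging to keep the example quiet


def fetch_dataset(host, port):
    """Client side: request the dataset over HTTP and decode the JSON."""
    connection = http.client.HTTPConnection(host, port, timeout=5)
    try:
        connection.request("GET", DATASET_PATH)
        response = connection.getresponse()
        if response.status != 200:
            raise RuntimeError("unexpected status %d" % response.status)
        return json.loads(response.read().decode("utf-8"))
    finally:
        connection.close()


def run_demo():
    """Serve on a free local port, fetch the dataset once, shut down."""
    server = HTTPServer(("127.0.0.1", 0), OpenDataHandler)
    worker = threading.Thread(target=server.serve_forever, daemon=True)
    worker.start()
    try:
        return fetch_dataset("127.0.0.1", server.server_port)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(run_demo())
```
\end{verbatim}

\noindent The client needs to know only the address and the path of the dataset, which is one way to read the claim that such interfaces lower transaction costs between data providers and developers.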

\begin{figure}[H]
  \centering
    \includegraphics[width=0.75\textwidth]{./Pictures/webservicemodel}
    \caption{\textit{The Web Services Model} (\url{https://support.novell.com/techcenter/articles/dnd20030304.html})}
    \label{fig:webservicemodel}
\end{figure}

\begin{figure}[H]
  \centering
    \includegraphics[width=1\textwidth]{./Pictures/servicecomponents}
    \caption{\textit{Components of the web services architecture} \citep[p. 247]{HOFMANN}}
    \label{fig:servicecomponents}
\end{figure}

\noindent The World Wide Web Consortium (W3C) defines a Web service as \textit{``a software system designed to support interoperable machine-to-machine interaction over a network. The interface to a specific Web service is described in a machine-processable format called Web Services Definition Language [WSDL]''} \citep[p. 247]{HOFMANN}. In Figure~\ref{fig:webservicemodel}, a standard Web service model shows the web elements \textit{``that interwork to make up this Web-based paradigm for announcing, discovering, describing, locating, and exchanging messages to use the services''} \citep[p. 247]{HOFMANN}. This model, and modifications thereof, can handle real-time data consumption and communication, and it relies on common open standards. We will not give an in-depth explanation of the standard Web service model but only point to the explanation given by \cite{HOFMANN} in Figure~\ref{fig:servicecomponents}. \bigskip
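
\noindent As an illustration of the machine-processable messages exchanged in this model, the Python sketch below builds a SOAP 1.1 request envelope and reads it back, as a receiving service might. The operation \texttt{GetDataset}, its parameter \texttt{datasetId}, and the service namespace \texttt{http://example.org/opendata} are hypothetical and are not defined by \cite{HOFMANN} or by any real service; only the SOAP 1.1 envelope namespace is standard.

\begin{verbatim}
```python
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace (standard).
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
# Hypothetical namespace of an imagined open data service.
SERVICE_NS = "http://example.org/opendata"

NAMESPACES = {"soap": SOAP_NS, "svc": SERVICE_NS}


def qualified(namespace, tag):
    """Return the tag in the {namespace}tag form used by ElementTree."""
    return "{" + namespace + "}" + tag


def build_request(dataset_id):
    """Build a SOAP envelope asking the imagined service for a dataset."""
    envelope = ET.Element(qualified(SOAP_NS, "Envelope"))
    body = ET.SubElement(envelope, qualified(SOAP_NS, "Body"))
    operation = ET.SubElement(body, qualified(SERVICE_NS, "GetDataset"))
    parameter = ET.SubElement(operation, qualified(SERVICE_NS, "datasetId"))
    parameter.text = dataset_id
    return ET.tostring(envelope, encoding="unicode")


def read_dataset_id(message):
    """Service side: extract the requested dataset id from the envelope."""
    root = ET.fromstring(message)
    node = root.find("soap:Body/svc:GetDataset/svc:datasetId", NAMESPACES)
    if node is None:
        raise ValueError("message does not contain a GetDataset request")
    return node.text


if __name__ == "__main__":
    request = build_request("udlansdata")
    print(request)
    print(read_dataset_id(request))
```
\end{verbatim}

\noindent In a complete Web service, the structure of such messages would be described in the service's WSDL document, so that clients can generate and validate requests automatically.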

\noindent \cite{BERNERSLEE} emphasize that \textit{``raw dumps in formats such as CSV or XML sacrifices much of [the] structure and semantics [of the Web]''}. Linked data is a means to link, combine, and aggregate data entities on the Web \citep[p. 1]{BERNERSLEE}:

\begin{quote}
\textit{``In summary, Linked Data is simply about using the Web to create typed links between data from different sources. These may be as diverse as databases maintained by two organisations that, historically, have not easily interoperated at the data level. Technically, Linked Data refers to data published on the Web in such a way that it is machine-readable, its meaning is explicitly defined, it is linked to other external data sets, and can in turn be linked to and from external data sets.''} \citep[p. 2]{BERNERSLEE}
\end{quote}

\noindent The Linked Data principles can be used as a standard for open data and for harvesting the value of data composition and combination. \cite{BERNERSLEE} point to the Linked Open Data Project as a concrete example.
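
\noindent The idea of typed links between data from different sources can be sketched in a few lines of Python. The two datasets, all URIs under \texttt{example.org}, and the helper functions below are invented for illustration; a real application would typically use RDF tooling rather than plain tuples. Each statement is a (subject, predicate, object) triple, and the predicate acts as the type of the link.

\begin{verbatim}
```python
# Hypothetical vocabulary and datasets; every URI is invented.
VOCAB = "http://example.org/vocab/"
LIBRARY = "http://example.org/city/library/1"
DISTRICT = "http://example.org/geo/district/7"

# Dataset maintained by one organisation (the city).
CITY_DATA = [
    (LIBRARY, VOCAB + "name", "Central Library"),
    (LIBRARY, VOCAB + "locatedIn", DISTRICT),  # link into the other dataset
]

# Dataset maintained by a second organisation (a geographic register).
GEO_DATA = [
    (DISTRICT, VOCAB + "name", "Old Town"),
]


def objects(triples, subject, predicate):
    """Return all objects stated for a given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def district_names(library_uri):
    """Follow the typed locatedIn link across the two datasets."""
    merged = CITY_DATA + GEO_DATA  # combining sources is concatenation
    names = []
    for district_uri in objects(merged, library_uri, VOCAB + "locatedIn"):
        names.extend(objects(merged, district_uri, VOCAB + "name"))
    return names


if __name__ == "__main__":
    print(district_names(LIBRARY))
```
\end{verbatim}

\noindent Because both datasets identify the district by the same URI, they can be combined without any prior agreement on a shared database schema, which is the property that a raw CSV dump does not provide.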

